Solving Two-class Classification Problem Using Adaboost
Abstract
This paper presents a learning algorithm based on AdaBoost for solving the two-class classification problem. The concept of boosting is to combine several weak learners to form a highly accurate strong classifier. AdaBoost is fast and simple because it only requires weak learning algorithms that perform better than random guessing, instead of a single algorithm designed to learn accurately over the entire space. We evaluated the algorithms using the Breast Cancer Wisconsin dataset, which consists of 699 patterns with 9 attributes. The dataset is intended to assist medical practitioners in breast cancer diagnosis, so the class output is the diagnosis prediction, which is either benign or malignant. For comparison, a back propagation neural network (BPNN) is developed and implemented on the same database. Experimental results show that AdaBoost is able to outperform BPNN under the same experimental conditions.
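The idea of combining weak learners that are only better than random can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes one-feature threshold rules (decision stumps) as the weak learners, labels encoded as +1 and -1, and a made-up ten-point toy set rather than the Breast Cancer Wisconsin data. The function names (train_stump, adaboost_fit, and so on) are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] <= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] <= thr else -polarity

def adaboost_fit(X, y, rounds):
    """Fit a weighted ensemble of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n              # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = stump[0]
        if err >= 0.5:             # weak learner is no better than random
            break
        err = max(err, 1e-10)      # avoid division by zero on a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Weighted majority vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data (an assumption for illustration): no single threshold separates it.
X = [[v] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single = adaboost_fit(X, y, rounds=1)
boosted = adaboost_fit(X, y, rounds=3)
single_correct = sum(adaboost_predict(single, r) == t for r, t in zip(X, y))
boosted_correct = sum(adaboost_predict(boosted, r) == t for r, t in zip(X, y))
print(single_correct, boosted_correct)
```

On this toy set the best single stump gets 7 of 10 points right, while the three-stump vote classifies all 10 correctly, which is the sense in which weak learners combine into a stronger classifier. Results on real data such as the dataset used in the paper would of course differ.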
Related resources
ADABOOST ENSEMBLE ALGORITHMS FOR BREAST CANCER CLASSIFICATION
With advances in technology, different tumor features have been collected for Breast Cancer (BC) diagnosis. Processing such large data sets poses challenges, including high storage requirements and the time required for access and processing. The objective of this paper is to classify BC based on the extracted tumor features. To extract useful information and diagnose the tumo...
Multi-class AdaBoost
Boosting has been a very successful technique for solving the two-class classification problem. In going from two-class to multi-class classification, most algorithms have been restricted to reducing the multi-class classification problem to multiple two-class problems. In this paper, we develop a new algorithm that directly extends the AdaBoost algorithm to the multi-class case without reducin...
Improving Imbalanced data classification accuracy by using Fuzzy Similarity Measure and subtractive clustering
Classification is one of the important parts of data mining and knowledge discovery. In most cases, the data used to train the clusters is not well distributed. This imbalance occurs when one class has a large number of samples while the number of samples in the other class is inherently low. In general, the methods of solving this kind of prob...
Multiclass Boosting with Adaptive Group-Based kNN and Its Application in Text Categorization
AdaBoost is an excellent committee-based tool for classification. However, its effectiveness and efficiency in multiclass categorization face challenges from methods based on support vector machines (SVM), neural networks (NN), naïve Bayes, and k-nearest neighbor (kNN). This paper uses a novel multi-class AdaBoost algorithm to avoid reducing the multi-class classification problem to multiple tw...
A robust multi-class AdaBoost algorithm for mislabeled noisy data
AdaBoost has been theoretically and empirically proved to be a very successful ensemble learning algorithm, which iteratively generates a set of diverse weak learners and combines their outputs using the weighted majority voting rule as the final decision. However, in some cases, AdaBoost leads to overfitting especially for mislabeled noisy training examples, resulting in both its degraded gene...